97 research outputs found

    A constructive multi-way circuit partitioning algorithm based on minimum degree ordering

    Ankara: The Department of Computer Engineering and Information Science and the Institute of Engineering and Science of Bilkent Univ., 1994. Thesis (Master's) -- Bilkent University, 1994. Includes bibliographical references (leaves 52-54). Circuit partitioning has many important applications in VLSI. The circuit partitioning problem is most properly modeled as hypergraph partitioning. In this work, we propose a novel k-way hypergraph partitioning heuristic using the Minimum Degree (MD) ordering, a well-known heuristic for reducing the amount of fill in the factorization of symmetric sparse matrices. The proposed algorithm operates on the dual graph of the given hypergraph. It grows node-clusters on the dual graph, which induce cell-clusters with locally minimum net-cut sizes. The quotient graph concept, widely used in MD ordering, is exploited for efficient implementation. The proposed algorithm outperforms well-known heuristics, such as Kernighan-Lin (KL) based algorithms and Simulated Annealing, in terms of solution quality on various VLSI benchmark circuits. A nice property of the proposed algorithm is that its execution time decreases as k increases, unlike the existing iterative heuristics. It is even faster than the fast KL-based algorithms on the partitioning of the benchmark circuits for k > 16. Çatalyürek, Ümit V. M.S.
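
    The two notions this abstract relies on, the net-cut size of a partition and the dual graph of a hypergraph, can be illustrated with a minimal sketch. It is not the thesis algorithm: it assumes that "dual graph" means the net-intersection graph (one node per net, an edge between nets that share a cell), and the circuit, the function names, and the partition are made up for illustration.

```python
from itertools import combinations


def net_cut_size(nets, part_of):
    """Count the nets whose cells are spread over more than one part."""
    cut = 0
    for cells in nets.values():
        parts = {part_of[c] for c in cells}
        if len(parts) > 1:
            cut += 1
    return cut


def dual_graph(nets):
    """Build the net-intersection graph (assumed meaning of 'dual graph').

    Each net becomes a node; two nodes are adjacent when their nets
    share at least one cell.
    """
    adj = {n: set() for n in nets}
    for u, v in combinations(nets, 2):
        if set(nets[u]) & set(nets[v]):
            adj[u].add(v)
            adj[v].add(u)
    return adj


# Hypothetical toy circuit: 4 nets over 5 cells.
nets = {
    "n1": ["a", "b"],
    "n2": ["b", "c", "d"],
    "n3": ["d", "e"],
    "n4": ["a", "e"],
}
# Hypothetical 3-way partition of the cells.
part_of = {"a": 0, "b": 0, "c": 1, "d": 1, "e": 2}

print(net_cut_size(nets, part_of))  # 3: only n1 stays inside one part
print(sorted(dual_graph(nets)["n1"]))  # ['n2', 'n4']
```

    Clustering nodes of the dual graph groups nets together, and the cells touched only by nets of one cluster then form a cell-cluster, which is one way to read "node-clusters ... which induce cell-clusters" above.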

    Fast and high quality topology-aware task mapping

    Considering the large number of processors and the size of the interconnection networks on exascale-capable supercomputers, mapping concurrently executable and communicating tasks of an application is a complex problem that must be handled with care. For parallel applications, the communication overhead can be a significant bottleneck on scalability. Topology-aware task-mapping methods that map the tasks to the processors (i.e., cores) by exploiting the underlying network information are very effective at avoiding, or at worst bending, this limitation. We propose novel, efficient, and effective task mapping algorithms employing a graph model. The experiments show that the methods are faster than the existing approaches proposed for the same task, and on 4096 processors, the algorithms improve the communication hops and link contentions by 16% and 32%, respectively, on average. In addition, they improve the average execution time of a parallel SpMV kernel and a communication-only application by 9% and 14%, respectively.
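
    As a rough illustration of the "communication hops" metric mentioned above, the sketch below scores a task-to-processor mapping by its weighted hop count. The abstract does not name a topology, so the 2D mesh with row-major processor ids, the task graph, and both mappings are assumptions made purely for illustration; this is not the authors' algorithm.

```python
def mesh_hops(p, q, width):
    """Hop distance between processors p and q on a 2D mesh.

    Processors are numbered row by row (an assumption of this sketch),
    so the distance is the Manhattan distance between grid positions.
    """
    py, px = divmod(p, width)
    qy, qx = divmod(q, width)
    return abs(px - qx) + abs(py - qy)


def weighted_hops(edges, mapping, width):
    """Sum of (communication volume x hop distance) over task-graph edges."""
    return sum(w * mesh_hops(mapping[u], mapping[v], width) for u, v, w in edges)


# Hypothetical task graph: (task_u, task_v, communication volume).
edges = [(0, 1, 10), (1, 2, 10), (2, 3, 10), (0, 3, 1)]
width = 2  # 2x2 mesh, processors laid out as: 0 1 / 2 3

naive = {0: 0, 1: 3, 2: 1, 3: 2}  # ignores the topology
aware = {0: 0, 1: 1, 2: 3, 3: 2}  # keeps heavy edges one hop apart

print(weighted_hops(edges, naive, width))  # 51
print(weighted_hops(edges, aware, width))  # 31
```

    A topology-aware mapper searches for assignments like `aware`, where heavily communicating tasks land on neighboring processors.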

    Cooperative Minibatching in Graph Neural Networks

    Significant computational resources are required to train Graph Neural Networks (GNNs) at a large scale, and the process is highly data-intensive. One of the most effective ways to reduce resource requirements is minibatch training coupled with graph sampling. GNNs have the unique property that items in a minibatch have overlapping data. However, the commonly implemented Independent Minibatching approach assigns each Processing Element (PE) its own minibatch to process, leading to duplicated computations and input data access across PEs. This amplifies the Neighborhood Explosion Phenomenon (NEP), which is the main bottleneck limiting scaling. To reduce the effects of NEP in the multi-PE setting, we propose a new approach called Cooperative Minibatching. Our approach capitalizes on the fact that the size of the sampled subgraph is a concave function of the batch size, leading to significant reductions in the amount of work per seed vertex as batch sizes increase. Hence, it is favorable for processors equipped with a fast interconnect to work on a large minibatch together as a single larger processor, instead of working on separate smaller minibatches, even though the global batch size is identical. We also show how to take advantage of the same phenomenon in serial execution by generating dependent consecutive minibatches. Our experimental evaluations show up to 4x bandwidth savings for fetching vertex embeddings, simply by increasing this dependency, without harming model convergence. Combining our proposed approaches, we achieve up to 64% speedup over Independent Minibatching on single-node multi-GPU systems. Comment: Under submission.
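
    The overlap effect behind this argument can be seen on a toy graph. The sketch below is an illustration under simplifying assumptions, not the paper's method: it expands full one-hop neighborhoods instead of sampling, uses a hypothetical 8-vertex ring in which each vertex is linked to the vertices one and two steps away, and counts touched vertices as a stand-in for work.

```python
def one_hop_size(adj, seeds):
    """Number of distinct vertices touched by the seeds and their neighbors."""
    touched = set(seeds)
    for s in seeds:
        touched.update(adj[s])
    return len(touched)


# Hypothetical graph: ring of 8 vertices, each linked to v-2, v-1, v+1, v+2.
n = 8
adj = {v: [(v + d) % n for d in (-2, -1, 1, 2)] for v in range(n)}

# Independent minibatching: two PEs, each expands its own batch of 2 seeds.
batches = [[0, 1], [2, 3]]
independent = sum(one_hop_size(adj, b) for b in batches)

# Cooperative minibatching: the same 4 seeds expanded once as a single batch.
cooperative = one_hop_size(adj, [s for b in batches for s in b])

print(independent)  # 12: shared neighbors are fetched by both PEs
print(cooperative)  # 8: each vertex is fetched once
```

    Doubling the batch from 2 to 4 seeds grows the touched set from 6 to only 8 vertices here, which is the sublinear growth that a concave subgraph-size curve implies.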

    MICA: microRNA integration for active module discovery

    A successful method to address disease-specific module discovery is the integration of gene expression data with the protein-protein interaction (PPI) network. Although many algorithms have been developed for this purpose, they focus only on the network genes (mostly on the well-connected ones), neglecting the genes whose interactions are partially or totally unknown. In addition, they only make use of gene expression data, which does not give the complete picture of the actual protein expression levels. The cell uses different mechanisms, such as microRNAs, to post-transcriptionally regulate the proteins without affecting the corresponding genes' expressions. Due to this complexity, using a single data type is definitely not the right way to find the correct module(s). Today, the unprecedented amount of publicly available disease-related heterogeneous data encourages the development of new methodologies to better understand complex diseases. In this work, we propose a novel workflow, Mica, which, to the best of our knowledge, is the first study integrating miRNA, mRNA, and PPI information to identify disease-specific gene modules. The novelty of Mica lies in several directions, such as the early modification of mRNA expression with microRNA to better highlight the indirect dependencies between the genes. We applied Mica to microRNA-Seq and mRNA-Seq data sets of 699 invasive ductal carcinoma samples and 150 invasive lobular carcinoma samples from the Cancer Genome Atlas Project (TCGA). The Mica modules are shown to unravel new and interesting dependencies between the genes. Additionally, the modules accurately differentiate between the case and control samples while being highly enriched with disease-specific pathways and genes.